An Empirical Evaluation of Evaluation Metrics of Procedurally Generated Mario Levels

Authors

Julian R. H. Mariño

Willian M. P. Reis

Levi Lelis

Abstract

There are several approaches in the literature for automatically generating Infinite Mario Bros levels. The evaluation of such approaches is often performed solely with computational metrics such as leniency and linearity. While these metrics are important for an initial exploratory evaluation of the content generated, it is not clear whether they are able to capture the player’s perception of the content generated. In this paper we evaluate several of the commonly used computational metrics. Namely, we perform a systematic user study with procedural content generation systems and compare the insights gained from our user study with those gained from analyzing the computational metric values. The results of our experiment suggest that current computational metrics should not be used in lieu of user studies for evaluating content generated by computer programs.

Similar resources

Using the Taxonomy and the Metrics: What to Study When and Why; Comment on “Metrics and Evaluation Tools for Patient Engagement in Healthcare Organization- and System-Level Decision-Making: A Systematic Review”

Dukhanin and colleagues’ taxonomy of metrics for patient engagement at the organizational and system levels has great potential for supporting more careful and useful evaluations of this ever-growing phenomenon. This commentary highlights the central importance to the taxonomy of metrics assessing the extent of meaningful participation in decision-making by patients, consumers and community mem...

Generating Maps Using Markov Chains

In this paper we outline a method of procedurally generating maps using Markov Chains. Our method attempts to learn what makes a “good” map from a set of given human-authored maps, and then uses those learned patterns to generate new maps. We present an empirical evaluation using the game Super Mario Bros., showing encouraging results.

Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines

Purpose: Many metrics have traditionally been used to evaluate search engines, and researchers have proposed new ones in recent years. Awareness of these new metrics is essential for research on search engine evaluation. The purpose of this study was therefore to analyze important and new metrics for evaluating search engines. Methodology: This is ...

Metrics and Evaluation Tools for Patient Engagement in Healthcare Organization- and System-Level Decision-Making: A Systematic Review

Background: Patient, public, consumer, and community (P2C2) engagement in organization-, community-, and system-level healthcare decision-making is increasing globally, but its formal evaluation remains challenging. To define a taxonomy of possible P2C2 engagement metrics and compare existing evaluation tools against this taxonomy, we conducted a systematic review. Methods: A broad search strate...

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are central to Machine Translation (MT) engines, which are developed through frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...

Publication date: 2015
